Identifying Basic Patterns of Korean Natural Language
Abstract
Korean natural language queries are composed of a number of basic building blocks. This paper describes a process for identifying the basic patterns that serve as these building blocks. Two sets of Korean queries, generated by two groups of senior-level students, were used in the experiments. Queries in the first set were produced by students who had no knowledge of databases or schemas. Students in the second group attended a short lecture and understood that their Korean queries would be executed by a computer. Analysis of these experimental queries identified seven basic patterns. Korean queries formed by combining these basic patterns cover more than 80% of all questions.
Related Resources
The Machine Translation Researches and Governmental View in Korea
Viewed from a broad perspective, in the seventies when we studied the basic technologies of NLP as a groundwork of MT, the focus of research was given to describing various phenomena of the Korean language in a linguistically significant way and processing the Korean characters mathematically or specific phenomena of the language logically with a computer. The theoretical linguistic description...
Segmentation Granularity in Dependency Representations for Korean
Previous work on Korean language processing has proposed different basic segmentation units. This paper explores different possible dependency representations for Korean using different levels of segmentation granularity — that is, different schemes for morphological segmentation of tokens into syntactic words. We provide a new Universal Dependencies (UD)-like corpus based on different levels o...
Text Mining: Extraction of Interesting Association Rule with Frequent Itemsets Mining for Korean Language from Unstructured Data
Text mining is a specific method to extract knowledge from structured and unstructured data. The knowledge extracted by the text mining process can be used for further use and discovery. This paper presents a method for extracting information from unstructured text data and the importance of Association Rules Mining, specifically for the Korean language (text) and also, NLP (Natural Language ...
A Constrained Finite-State Morphotactics for Korean
In this paper, we propose a constrained finite-state model, named cfsm, for Korean morphotactics and attempt to show how it can successfully treat some major morphological problems in Korean. As a preliminary descriptive framework, this model adopts the Korean morphological system Komor by Lee (1999) to lay out some basic problems in Korean morphotactics and describe linear approaches ...
A Comparison of Two Variant Corpora: The Same Content with Different Source
In order to investigate the effect of source language on translations, we investigate two variants of a Korean translation corpus. The first variant consists of Korean translations of 162,308 Japanese sentences from the ATR BTEC (Basic Expression Text Corpus). The second variant was made by translating the English translations of the Japanese sentences into Korean. We show that the sou...